Abstract

The availability of hyperspectral infrared remote sensing instruments,
like AIRS and IASI, on board Earth observing satellites opens the
possibility of obtaining high vertical resolution atmospheric profiles.
We present an objective and simple technique to derive the parameters
used in the optimal estimation method that retrieves atmospheric states
from the spectra. The retrievals obtained in this way are optimal in the
sense of providing the best possible validation statistics obtained from
the difference between retrievals and a chosen calibration/validation
dataset of atmospheric states. This is demonstrated analytically. To
illustrate this result, several real world examples using IASI retrievals
fine tuned to ECMWF analyses are shown. The analytical equations obtained
give further insight into the various contributions to the biases and
errors of the retrievals and the consequences of using other types of
fine tuning. Retrievals using IASI show an error of 0.9 to 1.9 K in
temperature and below 6.5 K in humidity dew point temperature in the
troposphere on the vertical pressure grid of the radiative transfer model
(RTIASI-4.1), which has a vertical spacing between 300 and 400 m. The
more accurately the calibration dataset represents the true state of the
atmosphere, the better the retrievals will be when compared to the true
states.
1 Introduction

Temperature and water vapor sounding from satellites has a history
dating back to the early 1970s with the NIMBUS series of operational
weather satellites (e.g. Wick, 1971). The next generation of instruments
improved the horizontal and vertical resolution of the soundings, in
particular the instruments that comprise the TOVS (TIROS Operational
Vertical Sounder; Smith et al., 1979) and the more recent ATOVS (Advanced
TIROS Operational Vertical Sounder; Kidwell, 1986). The latter consists
of the AMSU (Advanced Microwave Sounding Unit) and HIRS (High Resolution
Infrared Sounder) and provides soundings with an accuracy of about 2.0 K
for the temperature at 1-km vertical resolution and below 6.0 K for the
dewpoint temperature at 2-km vertical resolution (Li, 2000). However, in
order to make further advancements, the numerical weather prediction and
climate monitoring communities required improvements in accuracy for both
temperature (< 1 K) and humidity (< 10%) in the troposphere (World
Meteorological Organization, 1998). It became apparent that to achieve
these accuracies a new generation of instruments, now known as
hyperspectral infrared sounders, was needed. Smith (1991) gives a
detailed overview of the evolution of satellite sounding up to the
hyperspectral sounders we have today. AIRS (Atmospheric InfraRed Sounder;
Pagano et al. 2003; Aumann et al. 2003) and IASI (Infrared Atmospheric
Sounding Interferometer; Chalon, Cayla and Diebel 2001; Blumstein et al.
2004), on board Earth observing satellites, are the prime examples of
this type of instrument, with a spectral coverage within the
3.62-15.5 µm region and a spectral resolution ($\lambda/\Delta\lambda$)
higher than 1000, which can achieve the required accuracies in
temperature and humidity retrievals in the troposphere (Smith 1991).

There are currently three types of methods generally used to retrieve
temperature and humidity profiles from this type of instrument: linear
regression methods usually based on Empirical Orthogonal Functions (EOF)
(e.g. Smith and Woolf, 1975; Zhou, 2002), neural networks (e.g.
Blackwell, 2005) and model inversion methods (Twomey, 1977), also known
as Bayesian atmospheric flux inversion (Michalak, 2005), optimal
estimation (Rodgers, 1976) or physical retrievals (Li, 2000; Susskind,
2003). The latter type of method can vary in several details, such as the
choice of measurement error covariance matrix, the constraints applied,
etc. Whenever one of these methods makes use of synthetic radiances
generated from a radiative transfer model, some inversion parameters must
be fine tuned to match the retrieval method to the real world
measurements. In this paper, we will deal only with the model inversion
method which is commonly known as optimal estimation (Rodgers, 2000). Its
distinguishing feature is that the constraints are based on an a-priori
state of the atmosphere and its associated background covariance matrix.
The same type of analytical study as the one presented here can be made
on the EOF linear regression method when it is trained with synthetic
data (Calbet and Schlüssel 2006) or on any of the other methods, as long
as they can be described analytically.

Methods to retrieve temperature and humidity profiles from this kind of
instrument have been developed over the years. Modern methods can perform
retrievals over clear, cloudy, land or ocean scenes (Susskind, 2003;
Zhou, 2005). Despite the significant abundance of non-clear cases, the
physical state parameters, in particular surface emissivity and cloud
properties, are better known for clear scenes over ocean. In fact, these
cases provide the best retrieval statistics (e.g. Susskind 2003). Because
of this, the calibration explained in this paper is best performed on
these cases. This fine tuning can later be extrapolated to retrieve
profiles for any kind of scene. Also, to prove that the fine tuning
derived here is optimal, we need to verify it in practice with the best
possible cases available. This is the reason why in this paper we will
deal mainly with clear sky scenes over ocean.

The first and most critical tuning step is to adjust the numerical
modelling of the atmosphere to the real world. This can be done by fine
tuning the radiative transfer model to fit the measurements (Strow, 2006).
This procedure is usually not enough to obtain unbiased retrievals, and
it is usually necessary to apply bias corrections to the radiances or
brightness temperatures (Li, 2000; Susskind, 2006). The bias corrections
are usually obtained from the measurements by calculating the average of
the difference between the real observed spectra and the spectra
calculated from some collocated calibration dataset of atmospheric states
and a radiative transfer model. In a later retrieval step, measured (or
modeled) spectra must be bias corrected with this average. In this paper
we will provide an analytical justification for this method and we will
also show that it is optimal.

The second set of elements to be tuned in the retrieval process are the
various parameters used in the optimal estimation: the measurement error
covariance matrix, the a-priori state and the a-priori error covariance
matrix. For the retrieval method to work precisely we need to match these
elements of the retrievals to the real atmospheric system with a
relatively high degree of accuracy. The a-priori parameters are usually
determined from some climatology or numerical weather model fields which
are representative of the atmospheric states being retrieved. Contrary to
the bias corrections, the measurement error covariance matrix is not
usually derived from the measurements, but rather taken from the
instrument noise, possibly adding to it some estimate of the radiative
transfer model error (e.g. Susskind, 2003; Rodgers, 1990; Eyre 1990). The
reason for this is to keep the measurement error covariance as small as
possible in order to obtain some "ideal best" retrieval, relying very
much on the assumptions made and the validity of the modelling of the
atmosphere. In this paper we will abandon this hypothetical concept of
"ideal best" retrievals and take a more pragmatic approach. We will aim
to reproduce as accurately as possible the states of the atmosphere as
measured by some alternative instrument, typically either radiosondes or
Numerical Weather Prediction (NWP) analyses. These measurements will be
referred to interchangeably as the calibration or validation dataset
throughout this paper. By minimizing the standard deviation between the
retrievals and the validation dataset of atmospheric states we will
demonstrate analytically that there exists an optimal measurement error
covariance matrix for a particular validation dataset. This measurement
error covariance matrix is precisely the one obtained from the difference
between the real observed spectra and the spectra calculated from the
calibration dataset of atmospheric states and the radiative transfer
model. Since this matrix will also contain representativeness and
accuracy errors of the calibration dataset of atmospheric states, it will
depart from a hypothetical "ideal best" measurement error. We will show
that this is normally not a problem in the retrievals, since it is safer
to overestimate (Eyre, 1990) the measurement error of the retrievals with
respect to some absolute truth, as certainly happens with the method
presented here, than to underestimate it, as can potentially happen with
the methods mentioned above.

In the field of trace gas retrievals, Michalak et al. (2005) also derive
the measurement error covariance matrix from the measurements themselves.
The technique consists of maximizing the likelihood of the measurement
covariance given a radiative transfer model, an a-priori state and a
given set of measurements by applying Bayes' rule. The maximizing
solution is quite complex analytically and has to be solved numerically
with an iterative process to obtain the measurement error covariance. The
spirit of the method presented here is very similar to that of Michalak
et al. (2005), but it minimizes the statistics of the retrievals with
respect to the validation dataset of atmospheric states, therefore
obtaining the optimal retrievals. They are optimal in the sense that they
give the minimum bias and standard deviation when compared to the
validation dataset. Because the method is simple, the optimal measurement
error covariance matrix can be derived analytically and calculated with
real data in a straightforward way. The method offers an objective
methodology for populating the measurement error covariance of the
optimal estimation.

The fact that the solution is derived analytically gives an important
insight into which elements affect the resulting statistics of the
retrievals when compared with the validation dataset of atmospheric
states. We can even see how the retrieval error behaves when we use
different measurement error covariance matrices. Other behaviors of the
retrieval system could be studied further, for example the consequences
of using an optimality criterion other than minimizing the statistics of
the retrievals.

In Section 2 the method and the underlying assumptions are explained. The
analytical results and the proof that this method is optimal are
explained in Section 3 and shown in Appendix A. The tools used to apply
the method are explained in Section 4. Results of this method applied to
real world data are shown in Section 5. Finally, in Section 6 we discuss
the method and results.

2 Best Parameter Determination Method 
2.1 General Assumptions 

There are some underlying assumptions when applying this method which 
are worth mentioning. They are directly related to the goals we are pursuing 
with the retrievals. 

1. The ultimate goal is the retrievals themselves, which should be as
accurate as possible. This also implies that we want to validate the
retrievals with an alternative measurement of the state of the
atmosphere.

2. We recognize that the current modelling of the atmosphere is not
accurate enough to provide some "ideal best" retrievals, but rather that
we will need to calibrate the whole retrieval system with some
calibration dataset of atmospheric states.

3. We will assume that the scene under observation is measured only once
from space, although more than one instrument or channel can be used.
In this paper we will use IASI measurements. See Toohey and Strong
(2007) for an interesting discussion on cross calibration of different
platforms.

4. We have only one alternative type of measurement of the atmospheric
state, which will constitute our calibration/validation dataset. For
example, in this paper we will use NWP analysis fields to calibrate and
optimize the validation of the retrievals. This is in contrast to using
more than one source of atmospheric information, like combining NWP
and radiosonde data to fine tune and optimize the retrieval parameters.
The case of two different sources of atmospheric measurements for fine
tuning is not treated in this paper. Even so, we can still perform the
exercise of validating the retrievals with another source of atmospheric
information, as is shown in Section 5.4 with radiosondes.

2.2 Fine Tuning and Retrieval 

The method can be divided into two steps, the fine tuning and the
retrieval proper. These in turn can be broken down into the following
substeps:

1. Tuning step. 

(a) In the tuning step the measurements (radiances or brightness
temperatures) from the hyperspectral instrument are obtained. These
will be referred to as observations (OBS).

(b) The next step is to find the co-located calibration dataset
atmospheric state vector corresponding to the same scene as the
IASI observation. We then calculate the spectrum corresponding
to this atmospheric state vector using a radiative transfer model.
These spectra will be referred to as calculations (CALC).

(c) We then calculate the difference between observations and
calculations (OBS - CALC) for many different scenes.

(d) The last step is to get the statistics of this collection of OBS -
CALC differences, in particular the mean (bias) and the covariance
matrix.

(e) Optimal estimation is designed to work with data that has Gaussian
noise. This requirement has to be fulfilled by the OBS-CALC
difference. This working hypothesis should be verified by checking
the OBS-CALC histograms for each channel or measurement.

2. Retrieval step. 

(a) In the retrieval step the measured radiances or brightness
temperatures are corrected with the bias calculated in the fine tuning
step.

(b) We then use the OBS-CALC covariance obtained in the fine tuning
step directly as the measurement error covariance matrix of the
optimal estimation.

(c) The background state vector and its covariance matrix can be
calculated from any source as long as their statistics are similar
to those of the real atmosphere. In this particular example they
have been obtained from climatology.
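
The tuning substeps (c) and (d) and the retrieval substep (a) can be sketched in a few lines. This is a minimal illustration with synthetic numbers, not the processing code used for this paper; the array names, the sizes and the synthetic bias are assumptions made only for the example.

```python
import numpy as np

def obs_calc_statistics(obs, calc):
    """Mean (bias) and covariance of OBS - CALC over many scenes.

    obs, calc: arrays of shape (n_scenes, n_channels), hypothetical inputs
    holding observed and calculated brightness temperatures in K.
    """
    diff = obs - calc                    # OBS - CALC per scene and channel
    bias = diff.mean(axis=0)             # substep (d): mean over scenes
    cov = np.cov(diff, rowvar=False)     # substep (d): covariance over scenes
    return bias, cov

def bias_correct(obs, bias):
    """Retrieval substep (a): remove the OBS - CALC bias from the observations."""
    return obs - bias

# Synthetic demonstration (all values are made up).
rng = np.random.default_rng(0)
n_scenes, n_channels = 2000, 5
calc = 250.0 + 10.0 * rng.standard_normal((n_scenes, n_channels))
true_bias = np.array([0.5, -0.2, 0.0, 1.0, -0.7])
obs = calc + true_bias + 0.3 * rng.standard_normal((n_scenes, n_channels))

bias, cov = obs_calc_statistics(obs, calc)
corrected = bias_correct(obs, bias)
```

The covariance `cov` is the quantity that would be passed to the optimal estimation as the measurement error covariance matrix in retrieval substep (b).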

3 Best parameter determination for the optimal estimation

3.1 Bias corrections 

In Appendix A we give the analytical proof that the fine tuning method
described here is the optimal one. It is derived for the linear method
but can also be applied to the non-linear case if the first guess is
close enough to the final result.

The biases in the retrievals are given by Eq. 18, which we replicate
here,

$$
\overline{x_r - x_v} = \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1}
\left[ K^T S_\epsilon^{-1} \left(\overline{y_o} - \overline{y_c}\right)
+ S_a^{-1} \left(x_a - \overline{x_v}\right) \right], \tag{1}
$$

where $x_r$ is the retrieved atmospheric state, $x_v$ is the atmospheric
state from the calibration or validation dataset, $K$ is the Jacobian of
the radiative transfer model, $S_a$ is the background covariance matrix,
$S_\epsilon$ is the measurement error covariance matrix, $y_o$ is the
observed (OBS) spectrum, $y_c$ is the calculated (CALC) spectrum, $x_a$
is the background a-priori atmospheric state and an overbar denotes the
average over scenes. We can see from this expression that the bias comes
from two sources. The first source is the OBS - CALC bias in the spectra,
$\overline{y_o} - \overline{y_c}$. The second one comes from the
difference between the background a-priori and the validation dataset
atmospheric state. The latter error should be small if the information
content of the radiance spectra is high, as is the case for IASI. In
order to eliminate these biases in the retrievals, the spectral
measurements should be bias corrected according to Eq. 19,

$$
\overline{y_c} = \overline{y_o}, \tag{2}
$$

which is effectively an OBS-CALC bias correction. Also, the a-priori
state, which is a constant in the retrievals, should match the average of
the calibration dataset states (Eq. 20),

$$
x_a = \overline{x_v}. \tag{3}
$$
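
The two bias conditions can be checked numerically for the linear case. The sketch below is only an illustration under assumed values: the Jacobian, the covariances, the spectral bias and the sample sizes are all hypothetical, and the forward model is taken to be exactly linear (CALC = K x_v).

```python
import numpy as np

def gain_terms(K, S_a, S_e):
    """Return A = (K^T S_e^-1 K + S_a^-1)^-1 together with both inverses."""
    S_e_inv = np.linalg.inv(S_e)
    S_a_inv = np.linalg.inv(S_a)
    A = np.linalg.inv(K.T @ S_e_inv @ K + S_a_inv)
    return A, S_a_inv, S_e_inv

def linear_retrieval(y_o, x_a, K, S_a, S_e):
    """Linear optimal estimation, x_r = x_a + G (y_o - K x_a), one row per scene."""
    A, _, S_e_inv = gain_terms(K, S_a, S_e)
    G = A @ K.T @ S_e_inv
    return x_a + (y_o - x_a @ K.T) @ G.T

def predicted_bias(y_o, y_c, x_v, x_a, K, S_a, S_e):
    """Retrieval bias as predicted by Eq. 1."""
    A, S_a_inv, S_e_inv = gain_terms(K, S_a, S_e)
    return A @ (K.T @ S_e_inv @ (y_o - y_c).mean(axis=0)
                + S_a_inv @ (x_a - x_v.mean(axis=0)))

# Synthetic scenes (hypothetical sizes and values).
rng = np.random.default_rng(1)
n, p, m = 500, 3, 6
K = rng.standard_normal((m, p))
S_a, S_e = np.eye(p), 0.25 * np.eye(m)
x_v = np.array([1.0, -2.0, 0.5]) + rng.standard_normal((n, p))
y_c = x_v @ K.T                                   # CALC
y_o = y_c + 1.0 + 0.5 * rng.standard_normal((n, m))  # OBS with a 1 K bias

# Untuned: a-priori at zero, no spectral bias correction.
x_a0 = np.zeros(p)
bias_untuned = (linear_retrieval(y_o, x_a0, K, S_a, S_e) - x_v).mean(axis=0)
bias_eq1 = predicted_bias(y_o, y_c, x_v, x_a0, K, S_a, S_e)

# Tuned: Eq. 2 (bias corrected spectra) and Eq. 3 (a-priori = mean state).
y_o_corr = y_o - (y_o - y_c).mean(axis=0)
x_a1 = x_v.mean(axis=0)
bias_tuned = (linear_retrieval(y_o_corr, x_a1, K, S_a, S_e) - x_v).mean(axis=0)
```

In this linear setting the untuned bias reproduces Eq. 1 and the tuned bias vanishes to rounding error.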






3.2 Measurement and background error covariances 

In an ideal or simulated world the measurement error covariance matrix,
$S_{\epsilon,i}$, is quite accurately defined as

$$
S_{\epsilon,i} = \overline{\left(y_i - F_p(x_t)\right)\left(y_i - F_p(x_t)\right)^T}, \tag{4}
$$

where $y_i$ is an idealised instrument spectrum, $F_p$ represents a
perfect radiative transfer model and $x_t$ is the true state of the
atmosphere. However, in the real world the instrument does not behave
ideally, the radiative transfer model is not perfect and the true
atmospheric state can only be approximated by measurements. This in turn
implies that there is no practical way to derive the ideal
$S_{\epsilon,i}$. A practical alternative is to estimate it from
simultaneous measurements of the atmospheric state and spectra,

$$
S_\epsilon = \overline{\left(y_o - F(x_v)\right)\left(y_o - F(x_v)\right)^T}, \tag{5}
$$

where $y_o$ is the observed spectrum, $F$ represents the radiative
transfer model used and $x_v$ is the measurement of the state of the
atmosphere. Obviously, the better each of the components that go into
this equation is, the better the approximation of $S_{\epsilon,i}$ will
be. Ideally, the instrument should be well behaved, the radiative
transfer model should reproduce the radiation properly and the measured
atmospheric states should be as representative of the true state as
possible.

In what follows we will show that this latter $S_\epsilon$ is actually
the one that minimizes the errors of the retrievals when compared to the
validation dataset. For practical purposes, this value is a good estimate
of the ideal covariance, $S_{\epsilon,i}$, as will be shown in later
sections with real world examples and by showing below that it is better
to overestimate the measurement errors than to underestimate them in the
retrievals.

The covariance of the retrieval error is shown in Eq. 22. To see what
effect different values of the measurement error have on a particular
retrieval system, we can simplify the expression to just one measurement
and one retrieved variable. This simplified expression does not show in
full detail how an actual retrieval with many variables behaves; it is
only illustrative.

We can now plot the retrieval error as a function of $S_\epsilon$ for a
given set of parameters. This is shown in Fig. 1. We can see clearly that
the retrieval error increases much more rapidly when we underestimate the
measurement error than when we overestimate it. Assuming, just for this
argument, that $x_v$ is actually the absolute true state of the
atmosphere and not a calibration dataset as in the rest of this paper, we
can see that for practical purposes it is "safer" to overestimate the
measurement error than to underestimate it.
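
A one-variable, one-measurement version of this behavior can be sketched as follows. The expression used is the error variance of a linear scalar retrieval, assuming that the measurement error and the state are uncorrelated and that the a-priori variance used in the retrieval equals the true variance of the state. The numerical values are hypothetical and are not the parameters used for Fig. 1; they were chosen with the a-priori informative relative to the measurement, and the size of the asymmetry depends on that choice.

```python
import numpy as np

def scalar_retrieval_error_variance(s_assumed, s_true, s_a, k=1.0):
    """Error variance of a scalar linear retrieval.

    s_assumed: measurement error variance used in the retrieval
    s_true:    actual variance of the OBS - CALC difference
    s_a:       variance of the state around the a-priori
    k:         scalar Jacobian
    """
    g = k * s_a / (k**2 * s_a + s_assumed)      # retrieval gain
    # Error = g * (measurement error) + (1 - g k) * (a-priori error)
    return g**2 * s_true + (1.0 - g * k)**2 * s_a

s_true, s_a = 4.0, 1.0                          # hypothetical values
factors = np.array([0.1, 1.0, 10.0])            # under-, exact, over-estimate
errors = scalar_retrieval_error_variance(factors * s_true, s_true, s_a)

# Locate the minimum on a grid of assumed variances.
grid = np.linspace(0.1, 20.0, 200)
best = grid[np.argmin(scalar_retrieval_error_variance(grid, s_true, s_a))]
```

With these values the error variance is smallest when the assumed variance equals the true one, and underestimating by a factor of ten is penalised more than overestimating by the same factor.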
The measurement error covariance matrix, $S_\epsilon$, that minimizes the
errors of the retrievals with respect to the validation dataset is found
analytically to be (Eq. 26),

$$
S_\epsilon = \overline{\left(y_o - y_c\right)\left(y_o - y_c\right)^T}, \tag{6}
$$

which is exactly the OBS-CALC covariance matrix. To minimize the
resulting retrieval errors, the a-priori covariance matrix, $S_a$, should
satisfy

$$
S_a = \overline{\left(x_v - x_a\right)\left(x_v - x_a\right)^T}. \tag{7}
$$

This expression, together with Eq. 3, basically states that the a-priori
covariance matrix should match the covariance matrix of the validation
atmospheric states.
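
The optimality statement can be illustrated numerically in the linear case. The sketch below evaluates the retrieval error covariance for several assumed measurement error covariances, under the assumptions that the OBS-CALC error and the state are uncorrelated, that the bias conditions already hold and that the a-priori covariance equals the covariance of the validation states. All matrices are randomly generated and hypothetical.

```python
import numpy as np

def retrieval_error_covariance(K, S_a, S_e_assumed, S_e_true, S_x):
    """Covariance of x_r - x_v for a linear retrieval tuned with S_e_assumed.

    S_e_true is the actual OBS - CALC covariance and S_x the actual
    covariance of the validation states around the a-priori.
    """
    S_e_inv = np.linalg.inv(S_e_assumed)
    S_a_inv = np.linalg.inv(S_a)
    A = np.linalg.inv(K.T @ S_e_inv @ K + S_a_inv)
    G = A @ K.T @ S_e_inv            # gain applied to OBS - CALC
    M = A @ S_a_inv                  # equals I - G K, applied to x_a - x_v
    return G @ S_e_true @ G.T + M @ S_x @ M.T

rng = np.random.default_rng(2)
m, p = 8, 4
K = rng.standard_normal((m, p))
B = rng.standard_normal((p, p))
S_x = B @ B.T + np.eye(p)                    # state covariance
C = rng.standard_normal((m, m))
S_e_true = 0.1 * C @ C.T + 0.5 * np.eye(m)   # OBS - CALC covariance

scales = [0.1, 1.0, 10.0]
traces = [np.trace(retrieval_error_covariance(K, S_x, c * S_e_true, S_e_true, S_x))
          for c in scales]

# Using only the diagonal of the OBS - CALC covariance.
S_e_diag = np.diag(np.diag(S_e_true))
trace_diag = np.trace(retrieval_error_covariance(K, S_x, S_e_diag, S_e_true, S_x))
```

The total error variance (the trace) is smallest when the assumed matrix is the OBS-CALC covariance itself; scaled or diagonal-only versions cannot do better in this setting.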

One important aspect of this analytical proof is that it can be easily
modified to be used with other retrieval methods or even for validation
parameters other than the average or standard deviation.

4 Practical Example 

4.1 IASI Infrared Hyperspectral Measurements

The real world measurements come from the IASI instrument. IASI is a
hyperspectral infrared sounder on board the polar orbiting series of
Metop satellites that forms the EUMETSAT Polar System (EPS). Metop-A,
the first of three satellites of the series, was launched successfully on
19 October 2006 from the Baikonur Cosmodrome in Kazakhstan. IASI is a
Michelson interferometer measuring between 3.62 and 15.5 microns with a
spectral resolution of 0.5 cm$^{-1}$ after apodisation. The spatial
resolution is 12 km at nadir.

4.2 Scene Selection 

The scenes observed by IASI were selected for the fine tuning and
retrieval steps as clear sky over ocean at nighttime with latitudes
equatorward of 50°. The reason for this is to keep to a minimum unknown
effects which might show up, for example unknown surface emissivity over
land, cloud properties, etc. The selection criteria used to declare a
certain scene cloudy or clear in the fine tuning are critical. If a small
percentage of the scenes are cloud contaminated, this will lead to a
larger than desired bias in the final validation of the retrievals. In
this paper the clear scene selection method is the one followed by Lutz
(2002, 2003) and is tabulated in Table 1. A total of 5308 scenes have
been selected around midnight and noon on 10, 11, 17, 18, 19, 27, 28 and
29 April 2007. They were selected within plus or minus one hour of
midnight or noon to have an NWP analysis field close enough in time to
the IASI observations. This sample was split in two, a first set of 5042
scenes to calculate the fine tuning coefficients and the remaining 266 to
be validated against NWP analysis fields. Although the scenes used to
validate the retrievals are also clear sky over ocean, the same fine
tuning could be used on any other type of scene. An example of this is
shown in Section 5.4, where the retrievals are compared with co-located
radiosondes.

4.3 Radiative Transfer Model 

The radiative transfer model used is RTIASI 4.1 (Matricardi and Saunders
1999). This fast model provides both direct radiances or brightness
temperatures for each IASI channel and their corresponding Jacobians.
RTIASI also has a built-in model of surface emissivity, which we have
used.

4.4 Optimal Estimation Retrievals 

The retrieval method used is the non-linear optimal estimation explained
in Rodgers (2000). The technique is applied in brightness temperature
space. One of the prerequisites for applying this method is that the
errors are Gaussian. Although the instrument error is Gaussian only in
radiance space, the global OBS-CALC error in brightness temperature space
is in fact also Gaussian. Indeed, since the OBS-CALC error covariance
matrix includes, besides instrument error, also radiative transfer model
errors and NWP errors, the overall effect is a Gaussian error in
brightness temperature space. This can be verified in Fig. 2 for channel
3577 (1539 cm$^{-1}$), where the histogram of the OBS-CALC brightness
temperature has been plotted. As a counter-example and for illustrative
purposes, we also show the histogram for channel 5800 (2094.75 cm$^{-1}$)
in Fig. 3, which clearly deviates from a Gaussian function. This anomaly
comes from an incorrect CO input profile to the radiative transfer model.
To solve this problem we have to either discard this channel, which is
the solution adopted in this paper, or try to introduce a more realistic
CO profile.
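
The histograms in this paper were checked visually. As a simple automated proxy, which is not the procedure used here, one could screen channels with the sample skewness and excess kurtosis of OBS-CALC, both of which are close to zero for Gaussian data. The tolerance below is a hypothetical choice, and the two synthetic channels only stand in for a well behaved and an anomalous channel.

```python
import numpy as np

def gaussianity_moments(diff):
    """Sample skewness and excess kurtosis per channel.

    diff: OBS - CALC array of shape (n_scenes, n_channels).
    """
    z = (diff - diff.mean(axis=0)) / diff.std(axis=0)
    skewness = (z**3).mean(axis=0)
    excess_kurtosis = (z**4).mean(axis=0) - 3.0
    return skewness, excess_kurtosis

def flag_non_gaussian(diff, tol=0.5):
    """True for channels whose moments depart from Gaussian values by more than tol."""
    skewness, excess_kurtosis = gaussianity_moments(diff)
    return (np.abs(skewness) > tol) | (np.abs(excess_kurtosis) > tol)

# Synthetic demonstration: one Gaussian channel, one strongly skewed channel.
rng = np.random.default_rng(3)
n_scenes = 5000
gaussian_channel = 0.4 * rng.standard_normal(n_scenes)
skewed_channel = rng.exponential(scale=1.0, size=n_scenes) - 1.0
diff = np.column_stack([gaussian_channel, skewed_channel])

flags = flag_non_gaussian(diff)
```

Channels flagged in this way would then be inspected and, as done for channel 5800, discarded or modelled with better input profiles.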

Optimal estimation retrievals are performed on the temperature and water
vapor profiles and skin temperature. First guess estimates come from a
previous EOF linear regression retrieval (Calbet and Schlüssel, 2006) of
ozone, temperature and water vapor profiles and skin temperature.

The channels used in the retrievals are the ones with wavenumbers smaller
than 1900 cm$^{-1}$, except the ones in the ozone band. The reasons for
avoiding the short wavelength region are that it is difficult to model
daytime radiation effects, the instrument noise is high and there are
absorption lines of some trace gases whose atmospheric profiles are
poorly known (e.g. CO). The ozone band is not used because the ozone
profile is not retrieved in the optimal estimation.
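
The channel selection can be written down compactly. The sketch below assumes the standard IASI sampling of 8461 channels starting at 645 cm^-1 with 0.25 cm^-1 spacing, which reproduces the wavenumbers quoted above for channels 3577 and 5800. The limits of the ozone band are not given in this paper, so the values used here are hypothetical.

```python
import numpy as np

N_CHANNELS = 8461          # IASI channels, 645 to 2760 cm^-1
FIRST_WAVENUMBER = 645.0   # cm^-1
SAMPLING = 0.25            # cm^-1

def channel_wavenumber(channel):
    """Wavenumber in cm^-1 of a 1-based IASI channel number."""
    return FIRST_WAVENUMBER + SAMPLING * (np.asarray(channel) - 1)

def select_channels(max_wavenumber=1900.0, ozone_band=(1000.0, 1070.0)):
    """Channels below max_wavenumber, excluding the ozone band.

    ozone_band is an assumed interval, used only for illustration.
    """
    channels = np.arange(1, N_CHANNELS + 1)
    nu = channel_wavenumber(channels)
    in_ozone = (nu >= ozone_band[0]) & (nu <= ozone_band[1])
    return channels[(nu < max_wavenumber) & ~in_ozone]

selected = select_channels()
```

Channel 5800, which lies above 1900 cm^-1, is excluded by this selection, while channel 3577 is kept.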

Brightness temperatures are bias corrected with the OBS-CALC bias
obtained from the fine tuning step. They are then used by the optimal
estimation retrieval.

The measurement error covariance matrix of the optimal estimation is
diagonal, with elements equal to the square of the standard deviation of
OBS-CALC. Although the optimal error covariance matrix is actually the
full OBS-CALC covariance matrix (Eq. 26), our experience has shown that
only the diagonal (i.e. the square of the standard deviation) is actually
needed in these particular retrieval exercises. This slightly simplifies
the procedure.

The atmospheric state vectors used to calculate the a-priori parameters
(a-priori state vector and a-priori covariance matrix) are a modified
subset of the Chevallier profiles (Chevallier 2002). These profiles
constitute a representative sample of the atmosphere obtained from the
40-year re-analysis project of the European Centre for Medium-Range
Weather Forecasts (ECMWF).

The non-linear optimal estimation problem is solved iteratively using a
Levenberg-Marquardt minimization algorithm. The iterations are stopped
when the cost function no longer decreases significantly. Note that a
consequence of this is that the retrievals finish with brightness
temperature residuals well below the 1-$\sigma$ level of the measurement
error covariance matrix, $S_\epsilon$. See Section 6 for a more in depth
discussion.
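
The iteration can be sketched with a toy problem. The update below is one common Levenberg-Marquardt form of the optimal estimation step; the actual implementation used for this paper is not reproduced here. The forward model is a hypothetical two-variable, four-channel stand-in for the radiative transfer model, and the damping schedule and stopping tolerance are assumptions.

```python
import numpy as np

K0 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])

def forward(x):
    """Toy non-linear forward model (hypothetical, not a radiative transfer model)."""
    u = K0 @ x
    return u + 0.1 * u**2

def jacobian(x):
    """Analytical Jacobian of the toy forward model."""
    u = K0 @ x
    return (1.0 + 0.2 * u)[:, None] * K0

def cost(x, y, x_a, S_a_inv, S_e_inv):
    r, d = y - forward(x), x - x_a
    return r @ S_e_inv @ r + d @ S_a_inv @ d

def retrieve(y, x_a, S_a, S_e, x0, gamma=1.0, tol=1e-6, max_iter=50):
    """Levenberg-Marquardt optimal estimation; stops when the cost stalls."""
    S_a_inv, S_e_inv = np.linalg.inv(S_a), np.linalg.inv(S_e)
    x, j = x0.copy(), cost(x0, y, x_a, S_a_inv, S_e_inv)
    for _ in range(max_iter):
        K = jacobian(x)
        lhs = (1.0 + gamma) * S_a_inv + K.T @ S_e_inv @ K
        rhs = K.T @ S_e_inv @ (y - forward(x)) - S_a_inv @ (x - x_a)
        x_new = x + np.linalg.solve(lhs, rhs)
        j_new = cost(x_new, y, x_a, S_a_inv, S_e_inv)
        if j_new < j:                     # accept the step, relax the damping
            converged = (j - j_new) < tol * j
            x, j, gamma = x_new, j_new, gamma / 10.0
            if converged:
                break
        else:                             # reject the step, increase the damping
            gamma *= 10.0
    return x, j

x_true = np.array([1.0, -0.5])
x_a = np.zeros(2)
S_a, S_e = np.eye(2), 0.01 * np.eye(4)
y = forward(x_true)                       # noise-free synthetic observation
x_hat, j_hat = retrieve(y, x_a, S_a, S_e, x0=x_a.copy())
j_first_guess = cost(x_a, y, x_a, np.linalg.inv(S_a), np.linalg.inv(S_e))
```

Because the stopping rule acts on the cost function rather than on the residuals, the final residuals can end up well below the assumed measurement error, as noted above.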

4.5 Calibration/Validation Dataset of Atmospheric States

The reference states of the atmosphere are the NWP analyses from ECMWF.
They are co-located by choosing the atmospheric profile at the grid point
of the ECMWF analysis nearest to the IASI field of view. They are also at
most one hour apart from the IASI measurement.
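
A minimal sketch of this co-location is given below. The grid spacing, the coordinates and the times are hypothetical, and the great-circle distance is computed with the haversine formula, which is one reasonable choice rather than the method necessarily used here.

```python
import numpy as np
from datetime import datetime, timedelta

def nearest_grid_point(fov_lat, fov_lon, grid_lat, grid_lon):
    """Index of the grid point closest to the field of view (degrees in, index out)."""
    lat1, lon1 = np.radians(fov_lat), np.radians(fov_lon)
    lat2, lon2 = np.radians(grid_lat), np.radians(grid_lon)
    a = (np.sin((lat2 - lat1) / 2.0)**2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0)**2)
    central_angle = 2.0 * np.arcsin(np.sqrt(a))   # haversine formula
    return int(np.argmin(central_angle))

def close_in_time(obs_time, analysis_time, max_gap=timedelta(hours=1)):
    """True if observation and analysis are at most max_gap apart."""
    return abs(obs_time - analysis_time) <= max_gap

# Hypothetical 1-degree analysis grid, flattened to lists of points.
lat_axis = np.arange(-50.0, 51.0, 1.0)
lon_axis = np.arange(0.0, 360.0, 1.0)
grid_lat, grid_lon = [a.ravel() for a in np.meshgrid(lat_axis, lon_axis, indexing="ij")]

idx = nearest_grid_point(10.2, 20.4, grid_lat, grid_lon)
ok = close_in_time(datetime(2007, 4, 10, 23, 20), datetime(2007, 4, 11, 0, 0))
```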
4.6 Instrument Noise

Instrument noise does not have a Gaussian behavior in brightness
temperature space, which is what is required by optimal estimation.
Nevertheless, for the purpose of comparing with the optimal error
covariance matrix (OBS-CALC), retrievals were done using instrument noise
as the sole contribution to the measurement error covariance matrix. In
these cases, the instrument noise in brightness temperature was
calculated from the measured brightness temperature of each IASI field of
view.
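
A common way to express radiance noise in brightness temperature is to divide it by the derivative of the Planck function at the scene brightness temperature. The sketch below shows this conversion; the radiance noise value is hypothetical, and this is an illustration of the general technique rather than the exact computation performed for this paper.

```python
import numpy as np

C1 = 1.191042e-5   # first radiation constant, mW m^-2 sr^-1 (cm^-1)^-4
C2 = 1.438777      # second radiation constant, K cm

def planck(nu, t):
    """Planck radiance at wavenumber nu (cm^-1) and temperature t (K)."""
    return C1 * nu**3 / np.expm1(C2 * nu / t)

def dplanck_dt(nu, t):
    """Derivative of the Planck radiance with respect to temperature."""
    x = C2 * nu / t
    return C1 * nu**3 * x * np.exp(x) / (t * np.expm1(x)**2)

def noise_in_brightness_temperature(nedr, nu, t_b):
    """Convert radiance noise (NEdR) to brightness temperature noise (NEdT)."""
    return nedr / dplanck_dt(nu, t_b)

nu = 1539.0        # cm^-1, the wavenumber of channel 3577
nedr = 0.01        # hypothetical radiance noise, same units as planck()
nedt_cold = noise_in_brightness_temperature(nedr, nu, 220.0)
nedt_warm = noise_in_brightness_temperature(nedr, nu, 280.0)
```

The same radiance noise corresponds to a larger brightness temperature noise for colder scenes, which is why the conversion has to be done for each field of view.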

5 Practical Example Results 

5.1 Best Parameter Determination Method Results 

The OBS-CALC statistics for the 5042 profiles are shown in Figs. 4 and 5.
Fig. 4 shows the bias for each of the IASI wavelengths. The standard
deviation of OBS-CALC as a function of IASI wavelength is shown in
Fig. 5. For comparison purposes, the instrument noise in brightness
temperature space for one randomly chosen IASI spectrum is also shown in
Fig. 5. We can see that the total error in some regions is much higher
than the instrument noise. In those particular channels, the error
contribution from the radiative transfer modelling or the
representativeness of ECMWF analyses is much higher than the instrument
noise. Note that since we stop the iterations of the optimal estimation
algorithm when the cost function no longer decreases significantly, the
final brightness temperature residuals are well below the values of the
measurement error covariance matrix, $S_\epsilon$. See Section 6 for a
more in depth discussion.

5.2 Effects of Different Measurement Error Covariance 
Matrices 

In order to illustrate that the OBS-CALC covariance matrix is effectively
the optimum one to use for the retrievals, three different error
covariance matrices have been applied: the optimum one (OBS-CALC), a
constant standard deviation of 2 K and the instrument noise alone. The
retrieval technique is the optimal estimation explained in Section 2 for
all three experiments. Retrievals were made on the 266 measured IASI
fields of view (which are independent of the 5042 scenes used for fine
tuning). In this case the retrievals were performed on non-polar, clear
sky, nighttime scenes over the ocean, selected as described in the Scene
Selection section.

A comparison of all three methods (OBS-CALC, 2 K and instrument noise as
measurement error covariance matrices) can be seen in Fig. 6. As
expected, the optimum covariance matrix (OBS-CALC) offers the best
retrievals within the statistical noise of the comparison.

5.3 Statistics and Examples of Optimum Retrievals 

A few examples of retrievals on non-polar, nighttime, clear sky scenes
over the ocean using the optimal error covariance matrix are shown. In
Fig. 7 a typical IASI retrieval is shown together with the co-located
ECMWF atmospheric profile. There is a low level inversion that is clearly
retrieved in this example. The humidity profile is also similar to the
ECMWF analysis. In Fig. 8 we have a flatter temperature profile, which is
also relatively well retrieved, as is the humidity profile. In Fig. 9 we
see how a strong mid level inversion is also reproduced by the retrieval,
even with high humidity at lower levels. In Fig. 10 a humidity maximum is
well reproduced; this profile also has a strong inversion near the
surface.

The global statistics of these 266 cases when using the optimal error
covariance matrix are shown in Fig. 6 as a solid line (OBS-CALC). The
IASI retrieval accuracy is between 0.9 and 1.9 K in temperature and below
6.5 K in humidity dew point temperature in the troposphere. Note that
these statistics have been computed directly on the RTIASI-4.1 pressure
level grid without smoothing with the averaging kernels. This implies a
vertical spacing between levels of 300 to 400 m in the troposphere.
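
The validation statistics per pressure level amount to the mean and standard deviation of the difference between retrieval and reference profiles. The sketch below uses synthetic profiles; the number of levels, the offset and the noise level are hypothetical and are not the results reported in this paper.

```python
import numpy as np

def validation_statistics(retrieved, reference):
    """Per-level bias and standard deviation of retrieved minus reference.

    Both inputs have shape (n_cases, n_levels), on the same pressure grid
    and without averaging-kernel smoothing.
    """
    diff = retrieved - reference
    return diff.mean(axis=0), diff.std(axis=0, ddof=1)

# Synthetic demonstration with 266 cases and 10 levels (made-up values).
rng = np.random.default_rng(4)
n_cases, n_levels = 266, 10
reference = 250.0 + 20.0 * rng.standard_normal((n_cases, n_levels))
retrieved = reference + 0.3 + 1.5 * rng.standard_normal((n_cases, n_levels))

level_bias, level_std = validation_statistics(retrieved, reference)
```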

5.4 Comparison with Radiosondes 

Although the retrieval parameters have been fine tuned to ECMWF analyses,
the retrievals also compare well with co-located radiosondes. In
Figs. 11, 12 and 13 we show three IASI retrievals together with their
co-located radiosondes, launched five minutes before overpass time, from
campaign data obtained at Lindenberg. They were performed in clear sky
situations at night and daytime. It can be seen that the retrievals
reproduce particularly interesting features of the atmosphere, like low
level temperature inversions, levels of maximum humidity and the
tropopause. For illustrative purposes, the retrieval, sonde, first guess
(EOF retrieval) and background state for the first of these examples are
shown in Fig. 14.
6 Discussion

It is normally the case that the radiative transfer modelling of the
atmosphere does not coincide exactly with the observed infrared spectrum.
Although these differences may not seem very large, they are big enough
to degrade the retrievals significantly. They can have several causes,
such as not knowing the exact concentration of trace gases, erroneous
line shapes in the radiative transfer model or imperfect atmospheric
states, to name just a few. There are usually two ways to correct for
this error. The first is to model the atmosphere better, either by
improving the radiative transfer model or by using a more realistic
atmospheric state vector, for example by improving trace gas profiles.
The second is to bias correct the observed radiances or brightness
temperatures to match them to those of the radiative transfer model.

The errors found when validating the retrievals are usually assumed to come
from three different sources (Rodgers 1990): instrument errors, radiative
transfer model errors and inaccuracies in the representativeness of the
calibration dataset state vectors (NWP analyses in our case). It is usually
not simple to disentangle these sources of error in the retrievals. In this
paper we have not tried to do so, but rather to obtain the parameters
(biases, error covariance matrix and background properties) that give the
optimal retrievals when validated against the validation dataset (NWP
analyses in the example shown here). In this way, we deal with all the
errors at once. This makes the method simple but at the same time powerful,
since it provides the best possible retrievals when compared with one single
validation dataset of atmospheric states. The price we have to pay is that
the retrievals are not the best when compared with some "ideal" absolute
reality of the atmosphere, because we overestimate the measurement error by
including an undesired source of error, the one from the NWP analyses in
this case. In particular, by using more conservative values for the
measurement error covariance matrix, the potential high vertical resolution
of IASI retrievals might be compromised. This method is therefore useful
when the errors of the calibration dataset of atmospheric states with
respect to the true ones are small enough for our purposes, as could be the
case here with temperature and water vapor profiles coming from NWP
analyses. In any case, as we saw in Fig. 1, it is generally safer to
overestimate the measurement error, as we are doing with this method, than
to underestimate it.
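
The asymmetry between overestimating and underestimating the measurement
error can be checked with a scalar version of Eq. 22 of the appendix. The
Python sketch below is illustrative only: it uses the values quoted in the
caption of Fig. 1, assumes that the cross term between spectral and state
departures averages to zero, and the function name `retrieval_error` is
introduced here for convenience.

```python
import numpy as np

K = 0.8       # scalar Jacobian (value from the caption of Fig. 1)
S_A = 1.0     # a-priori covariance used in the retrieval
VAR_X = 1.0   # actual Cov(x_a - x_v)
VAR_Y = 1.0   # actual Cov(y_o - y_c)

def retrieval_error(s_eps):
    """Scalar Eq. 22 averaged over cases, x-y cross term assumed zero."""
    normalisation = K**2 / s_eps + 1.0 / S_A
    numerator = (K / s_eps) ** 2 * VAR_Y + VAR_X / S_A**2
    return numerator / normalisation**2

s_grid = np.linspace(0.1, 3.0, 291)   # assumed S_eps values, step 0.01
errors = retrieval_error(s_grid)
s_best = s_grid[np.argmin(errors)]

print("S_eps with the smallest retrieval error:", s_best)
print("error at S_eps = 0.5, 1.0, 2.0:",
      retrieval_error(0.5), retrieval_error(1.0), retrieval_error(2.0))
```

With these values the minimum lies at an assumed measurement error equal to
the actual OBS-CALC covariance (error of about 0.61). Doubling the assumed
error gives about 0.67, whereas halving it gives about 0.68.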

Note that we are using a calibration and validation dataset that does not
represent the true atmospheric states perfectly (ECMWF analyses), which
in turn gives what could be regarded as an oversized measurement error
covariance matrix, as shown in Fig. 5. Despite this, the vertical resolution
of the retrievals does not seem to be penalised as much as could be
expected. This can be seen, for example, in Fig. 7, where a low level
temperature inversion is correctly reproduced. The reason is that we stop
the iterations of the optimal estimation algorithm when the cost function
no longer decreases significantly, which in practice means that the final
brightness temperature residuals of the spectra are usually well below the
values of the measurement error covariance matrix, $S_\epsilon$. More than
the absolute values of $S_\epsilon$, the important quantities here are the
relative magnitudes within the matrix, which is what effectively enters the
cost function in the optimal estimation.
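
A stopping rule of this kind can be sketched as follows. The code is a
minimal illustration and not the retrieval software used in this paper: it
assumes a Gauss-Newton iteration of the form given by Rodgers (2000), and
the toy forward model `forward`, its Jacobian `jacobian`, the tolerance
`rel_tol` and all numerical values are hypothetical.

```python
import numpy as np

def forward(x):
    """Hypothetical smooth non-linear forward model (3 channels, 2 states)."""
    return np.array([x[0] + 0.1 * x[1] ** 2,
                     x[1] + 0.1 * x[0] ** 2,
                     x[0] * x[1]])

def jacobian(x):
    """Analytic Jacobian of the toy forward model."""
    return np.array([[1.0, 0.2 * x[1]],
                     [0.2 * x[0], 1.0],
                     [x[1], x[0]]])

def cost(x, y, x_a, s_eps_inv, s_a_inv):
    """Cost function J of the optimal estimation (Eq. 9 of the appendix)."""
    dy = y - forward(x)
    dx = x - x_a
    return dy @ s_eps_inv @ dy + dx @ s_a_inv @ dx

def retrieve(y, x_a, s_eps, s_a, rel_tol=1e-3, max_iter=20):
    """Iterate until the relative decrease of J falls below rel_tol."""
    s_eps_inv = np.linalg.inv(s_eps)
    s_a_inv = np.linalg.inv(s_a)
    x = x_a.copy()
    costs = [cost(x, y, x_a, s_eps_inv, s_a_inv)]
    for _ in range(max_iter):
        k = jacobian(x)
        lhs = k.T @ s_eps_inv @ k + s_a_inv
        rhs = k.T @ s_eps_inv @ (y - forward(x) + k @ (x - x_a))
        x_new = x_a + np.linalg.solve(lhs, rhs)
        j_old = costs[-1]
        j_new = cost(x_new, y, x_a, s_eps_inv, s_a_inv)
        if j_new < j_old:          # accept only steps that lower the cost
            x = x_new
            costs.append(j_new)
        if j_old - j_new < rel_tol * j_old:
            break                  # no significant descent: stop iterating
    return x, costs

x_true = np.array([1.0, 0.5])
y_obs = forward(x_true)            # noise-free synthetic observation
x_a = np.array([0.8, 0.7])
x_hat, costs = retrieve(y_obs, x_a, 0.01 * np.eye(3), np.eye(2))
print("retrieved state:", x_hat)
print("cost per accepted iteration:", costs)
```

The list `costs` records the cost after each accepted step, so the point at
which the descent becomes insignificant can be inspected directly.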

One drawback of this technique is that we can only use one source of 
atmospheric knowledge as the calibration state vector. It would be advan- 
tageous to extend this technique in such a way that more than one source 
of measurements could be used, for example, using NWP analyses and ra- 
diosondes at the same time. 

A direct consequence of the analytical solution is that there is one and
only one measurement error covariance matrix that is optimal for the
validation dataset. If this covariance is modified to better match some
other validation dataset, or because we feel a lower value would work
better for the "real" atmospheric states, we will have to accept a
degradation of the retrieval statistics with respect to the first
validation dataset.

The retrievals would be even closer to the real atmospheric states if we
used a better calibration dataset, for example radiosondes. Unfortunately,
it is difficult to obtain radiosondes co-located with IASI, and there are
not enough of them to make a statistically significant sample.

This method is optimal in the sense of providing the smallest bias and
standard deviation in the validation of the retrievals. Parameters other
than these two could be devised to characterize the quality of the
retrievals. In that case, the analytical study could be modified to use
these new parameters.

The method has been designed to make optimal retrievals. It remains to be
seen whether this or a similar kind of analytical study would also be
useful for assimilation in NWP models. In that case, the processing chain
is much longer: it does not stop at the retrievals but extends all the way
to the forecasts.
A Analytical Proof of the Best Parameter
Determination Method 



In this appendix we prove that the optimal bias corrections and error
covariance matrix to be used in the retrievals are the OBS-CALC mean and
covariance. We will show this for the linearized forward radiative transfer
model.

The retrieval method consists in minimizing a cost function, $J$, with
respect to $x'$. The cost function can be explicitly written as

$$J = (y' - F(x'))^T S_\epsilon^{-1} (y' - F(x')) + (x' - x'_a)^T S_a^{-1} (x' - x'_a). \qquad (9)$$

Here we have used the usual matrix notation, similar to that of Rodgers
(2000), where $x'$ is the atmospheric state, $F$ the forward model, $y'$ the
hyperspectral measurements, $S_\epsilon$ the measurement error covariance
matrix used in the retrieval, $S_a$ the a-priori covariance matrix and
$x'_a$ the a-priori atmospheric state.

This complex non-linear problem is usually linearized by expanding the
forward model in a Taylor series to first order around a reference point
$x'_l$, which in general will be different from the a-priori background
state $x'_a$,

$$y' \simeq y'_l + K (x' - x'_l), \qquad (10)$$

where $K$ is the Jacobian of the forward model $F$ and $y'_l = F(x'_l)$. We
define $x$ and $y$ as the departures of the atmospheric states and
measurements from the reference point, $x'_l$ and $y'_l$ respectively,

$$y = y' - y'_l, \qquad (11)$$

$$x = x' - x'_l. \qquad (12)$$

After the linearisation the cost function becomes

$$J = (y - Kx)^T S_\epsilon^{-1} (y - Kx) + (x - x_a)^T S_a^{-1} (x - x_a). \qquad (13)$$

In the retrieval process we minimize this function by setting its
derivative equal to zero,

$$\frac{\partial J}{\partial x} = -K^T S_\epsilon^{-1} (y - Kx) + S_a^{-1} (x - x_a) + [\;]^T = 0, \qquad (14)$$

where $[\;]^T$ denotes the transpose of the rest of the right hand side of
the equation. Solving for $x$ we obtain the familiar retrieval expression,

$$x_R = \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} \left(K^T S_\epsilon^{-1} y_o + S_a^{-1} x_a\right), \qquad (15)$$

where $x_R$ stands for the retrieved atmospheric state and we have
substituted $y_o$ for $y$ to stress that this is the observed spectrum, as
opposed to the calculated one, $y_c$. The latter is obtained by applying
the radiative transfer model ($K$) to the calibration/validation dataset of
atmospheric states, $x_v$.
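
As a minimal numerical illustration of the linear retrieval expression of
Eq. 15, the following Python sketch evaluates it for a small synthetic
problem. The function name `linear_retrieval` and all numbers are
assumptions made for this example only.

```python
import numpy as np

def linear_retrieval(y_o, x_a, K, S_eps, S_a):
    """Linear optimal estimation retrieval (Eq. 15).

    y_o and x_a are departures from the linearisation point.
    """
    S_eps_inv = np.linalg.inv(S_eps)
    S_a_inv = np.linalg.inv(S_a)
    lhs = K.T @ S_eps_inv @ K + S_a_inv
    rhs = K.T @ S_eps_inv @ y_o + S_a_inv @ x_a
    # Solve the linear system instead of forming the inverse explicitly.
    return np.linalg.solve(lhs, rhs)

# Synthetic example: 3 channels, 2 state elements.
K = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
x_v = np.array([1.0, -1.0])
y_o = K @ x_v                      # noise-free "observation"
x_a = np.zeros(2)

# Very small assumed measurement error: the retrieval follows the spectrum.
x_trust_obs = linear_retrieval(y_o, x_a, K, 1e-6 * np.eye(3), np.eye(2))
# Very large assumed measurement error: the retrieval stays at the a-priori.
x_trust_prior = linear_retrieval(y_o, x_a, K, 1e6 * np.eye(3), np.eye(2))
print(x_trust_obs, x_trust_prior)
```

The two limiting cases show how the relative size of $S_\epsilon$ and $S_a$
controls the weight given to the spectrum and to the a-priori state.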

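We now proceed to calculate the bias of the retrievals by comparing them
with the validation dataset state of the atmosphere (NWP analyses for
example), $x_v$,

$$x_R - x_v = \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} \left(K^T S_\epsilon^{-1} y_o + S_a^{-1} x_a\right) - x_v. \qquad (16)$$

If we now multiply the $x_v$ term by
$\left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)$,
rearrange terms and take into account that $K x_v$ is what we have called
the calculated spectrum, $y_c$, we obtain

$$x_R - x_v = \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} \left[ K^T S_\epsilon^{-1} (y_o - y_c) + S_a^{-1} (x_a - x_v) \right]. \qquad (17)$$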
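
For readers who want the intermediate algebra between Eqs. 16 and 17
spelled out, one way to write it is the following. The shorthand $A$ is
introduced here only for brevity and is not part of the original notation.

```latex
\begin{align*}
A &\equiv K^T S_\epsilon^{-1} K + S_a^{-1}, \\
x_R - x_v
  &= A^{-1}\left(K^T S_\epsilon^{-1} y_o + S_a^{-1} x_a\right) - A^{-1} A\, x_v \\
  &= A^{-1}\left[K^T S_\epsilon^{-1} y_o + S_a^{-1} x_a
       - K^T S_\epsilon^{-1} K x_v - S_a^{-1} x_v\right] \\
  &= A^{-1}\left[K^T S_\epsilon^{-1}\,(y_o - y_c) + S_a^{-1}\,(x_a - x_v)\right],
  \qquad y_c = K x_v .
\end{align*}
```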



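By taking the expected value we obtain the final expression for the bias,

$$\overline{x_R - x_v} = \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} \left[ K^T S_\epsilon^{-1}\, \overline{(y_o - y_c)} + S_a^{-1}\, \overline{(x_a - x_v)} \right]. \qquad (18)$$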



We can see from this expression that the bias comes from two sources.
The first is the OBS-CALC bias in the spectra, $\overline{(y_o - y_c)}$. The
second comes from the difference between the background a-priori and the
calibration atmospheric states. In order to minimize the bias in the
retrievals we should bias correct the observations with the OBS-CALC
average, such that in the end

$$\overline{y_c} = \overline{y_o}. \qquad (19)$$

Also, the background state should be equal to the average of the
atmospheric states being retrieved,

$$x_a = \overline{x_v}. \qquad (20)$$

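The error of the retrievals can be measured with the covariance of the
difference between the retrieved and the validation atmospheric profiles,

$$\mathrm{Cov}(x_R - x_v) = (x_R - x_v)(x_R - x_v)^T. \qquad (21)$$

If we now insert the atmospheric profile difference from Eq. 17 we obtain

$$
\begin{aligned}
\mathrm{Cov}(x_R - x_v) ={}& \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1}
\left[ K^T S_\epsilon^{-1} (y_o - y_c) + S_a^{-1} (x_a - x_v) \right] \\
& \left[ (y_o - y_c)^T S_\epsilon^{-1} K + (x_a - x_v)^T S_a^{-1} \right]
\left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1}. \qquad (22)
\end{aligned}
$$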

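The optimal retrieval parameters, $S_\epsilon^{-1}$ and $S_a^{-1}$, can be
calculated by taking the derivative of this covariance with respect to
$S_\epsilon^{-1}$ and making it equal to zero,

$$
\begin{aligned}
\frac{\partial\, \mathrm{Cov}(x_R - x_v)}{\partial S_\epsilon^{-1}} ={}&
-\left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} K^T \otimes
K \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1}
\left[ K^T S_\epsilon^{-1} (y_o - y_c) + S_a^{-1} (x_a - x_v) \right] \\
&\quad \left[ (y_o - y_c)^T S_\epsilon^{-1} K + (x_a - x_v)^T S_a^{-1} \right]
\left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} \\
&+ \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} K^T \otimes
(y_o - y_c) \left[ (y_o - y_c)^T S_\epsilon^{-1} K + (x_a - x_v)^T S_a^{-1} \right]
\left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} \\
&+ [\;]^T, \qquad (23)
\end{aligned}
$$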
where $[\;]^T$ denotes the transpose of the rest of the right hand side of
the equation and the symbol $\otimes$ represents the tensor product of two
matrices, in the sense that each element of the four dimensional tensor
product is

$$(A \otimes B)_{ijkl} = A_{ij} B_{kl}. \qquad (24)$$

Note that, since the covariance matrices are symmetric,
$[S_\epsilon^{-1}]^T = S_\epsilon^{-1}$. The same applies for $S_a^{-1}$.

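Rearranging terms and averaging over many cases we are left with

$$
\begin{aligned}
\overline{\frac{\partial\, \mathrm{Cov}(x_R - x_v)}{\partial S_\epsilon^{-1}}} ={}&
\left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} K^T \otimes \Big[ \\
& -K \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} K^T S_\epsilon^{-1}\,
\overline{(y_o - y_c)(y_o - y_c)^T}\, S_\epsilon^{-1} K
+ \overline{(y_o - y_c)(y_o - y_c)^T}\, S_\epsilon^{-1} K \\
& -K \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} S_a^{-1}\,
\overline{(x_a - x_v)(y_o - y_c)^T}\, S_\epsilon^{-1} K \\
& -K \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} K^T S_\epsilon^{-1}\,
\overline{(y_o - y_c)(x_a - x_v)^T}\, S_a^{-1}
+ \overline{(y_o - y_c)(x_a - x_v)^T}\, S_a^{-1} \\
& -K \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} S_a^{-1}\,
\overline{(x_a - x_v)(x_a - x_v)^T}\, S_a^{-1}
\Big] \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1} + [\;]^T. \qquad (25)
\end{aligned}
$$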



We will now show that this derivative is zero if we set

$$S_\epsilon = \overline{(y_o - y_c)(y_o - y_c)^T}, \qquad (26)$$

$$S_a = \overline{(x_a - x_v)(x_a - x_v)^T}. \qquad (27)$$

Introducing Eqs. 26 and 27 into Eq. 25 we can verify that all terms
including only x's or only y's cancel, leaving only x and y cross-product
terms.
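
A sketch of this cancellation is given below. It assumes that the pure y
and pure x terms of the averaged derivative have the form written on the
left hand sides, and it uses the shorthand
$A = K^T S_\epsilon^{-1} K + S_a^{-1}$, which is introduced here only for
brevity.

```latex
% Substituting S_\epsilon = \overline{(y_o - y_c)(y_o - y_c)^T} and
% S_a = \overline{(x_a - x_v)(x_a - x_v)^T}:
\begin{align*}
\text{pure } y \text{ terms:}\quad
 & -K A^{-1} K^T S_\epsilon^{-1}\, S_\epsilon\, S_\epsilon^{-1} K
   + S_\epsilon\, S_\epsilon^{-1} K
   = -K A^{-1} K^T S_\epsilon^{-1} K + K, \\
\text{pure } x \text{ terms:}\quad
 & -K A^{-1} S_a^{-1}\, S_a\, S_a^{-1} = -K A^{-1} S_a^{-1}, \\
\text{sum:}\quad
 & K - K A^{-1}\left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)
   = K - K A^{-1} A = 0 .
\end{align*}
```

Neither group vanishes on its own; it is their sum that is zero.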

Let us now analyze the cross-product term and show that it is also zero,

$$\overline{(y_o - y_c)(x_a - x_v)^T} = \overline{(y_o - y_c)}\, x_a^T - \overline{y_o x_v^T} + \overline{y_c x_v^T}. \qquad (28)$$
The first term of the right hand side is zero when we apply the bias
corrections from Eq. 19. If the forward model, $F$, reproduces well enough
the properties of the real atmosphere, the calculated spectra should have
statistical properties similar to those of the observed ones, in the sense
that their cross-covariances with the validation atmospheric states should
be similar,

$$\overline{y_o x_v^T} \simeq \overline{y_c x_v^T}, \qquad (29)$$

which implies that the last two terms of the right hand side of Eq. 28
approximately cancel, so that their sum is also approximately zero.
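
The behaviour of the cross-product term can be illustrated with synthetic
data. The Python sketch below is not based on IASI data: it assumes that
the OBS-CALC differences consist of a constant bias plus noise that is
statistically independent of the atmospheric state, and all dimensions and
numerical values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cases, n_state, n_chan = 20000, 3, 5

K = rng.standard_normal((n_chan, n_state))      # arbitrary linear forward model
x_v = rng.standard_normal((n_cases, n_state))   # validation states (departures)
y_c = x_v @ K.T                                 # calculated spectra, y_c = K x_v

# Assumption: observations = calculated spectra + constant bias + noise
# that does not depend on the state.
obs_bias = np.array([0.5, -0.2, 0.0, 0.3, 1.0])
y_o = y_c + obs_bias + 0.3 * rng.standard_normal((n_cases, n_chan))

y_o = y_o - (y_o - y_c).mean(axis=0)   # bias correction, Eq. 19
x_a = x_v.mean(axis=0)                 # background state, Eq. 20

d_y = y_o - y_c
d_x = x_a - x_v
cross = d_y.T @ d_x / n_cases          # averaged cross-product term, Eq. 28
print("largest |cross term| element:", np.abs(cross).max())
```

Under this independence assumption the averaged cross-product term is
small compared with the diagonal elements of the OBS-CALC covariance and
tends to zero as the number of cases grows.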

This concludes the proof, whose results are Eqs. 19, 20, 26 and 27. Using
these solutions we can also calculate the final retrieval error covariance
matrix to obtain

$$\mathrm{Cov}(x_R - x_v) = \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1}, \qquad (30)$$

which is the usual accepted expression (Rodgers 2000).
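
The result can also be checked numerically. The following Python sketch
evaluates the averaged form of Eq. 22 with the cross terms neglected, first
with the matrices of Eqs. 26 and 27 and then with a rescaled measurement
error covariance. The function name `error_covariance`, the random
matrices and the scaling factors are assumptions chosen for illustration.

```python
import numpy as np

def error_covariance(K, S_eps_used, S_a_used, cov_y, cov_x):
    """Averaged Eq. 22 with the x-y cross terms neglected.

    S_eps_used, S_a_used: matrices assumed in the retrieval.
    cov_y, cov_x: actual covariances of (y_o - y_c) and (x_a - x_v).
    """
    Se_inv = np.linalg.inv(S_eps_used)
    Sa_inv = np.linalg.inv(S_a_used)
    A_inv = np.linalg.inv(K.T @ Se_inv @ K + Sa_inv)
    middle = K.T @ Se_inv @ cov_y @ Se_inv @ K + Sa_inv @ cov_x @ Sa_inv
    return A_inv @ middle @ A_inv

rng = np.random.default_rng(2)
n_state, n_chan = 3, 6
K = rng.standard_normal((n_chan, n_state))
B = rng.standard_normal((n_chan, n_chan))
cov_y = B @ B.T + n_chan * np.eye(n_chan)   # arbitrary positive definite matrix
cov_x = np.diag([1.0, 2.0, 0.5])

# Retrieval tuned as in Eqs. 26 and 27: S_eps = cov_y, S_a = cov_x.
optimal = error_covariance(K, cov_y, cov_x, cov_y, cov_x)
expected = np.linalg.inv(K.T @ np.linalg.inv(cov_y) @ K + np.linalg.inv(cov_x))

# Total error variance when S_eps is rescaled by a factor c.
traces = {c: np.trace(error_covariance(K, c * cov_y, cov_x, cov_y, cov_x))
          for c in (0.25, 0.5, 1.0, 2.0, 4.0)}
print("matches Eq. 30:", np.allclose(optimal, expected))
print("trace of the error covariance per scaling factor:", traces)
```

In this sketch the tuned retrieval reproduces Eq. 30, and any rescaling of
the measurement error covariance increases the total error variance.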
Figure 1: Retrieval error (as the covariance of the difference between the
retrieved parameter and the real one, $\mathrm{Cov}(x_R - x_v)$) as a
function of the measurement error covariance, $S_\epsilon$, as described by
Eq. O. Other values used in this plot are $K = 0.8$;
$S_a = \mathrm{Cov}(x_a - x_v) = \overline{(x_a - x_v)^2} = 1.0$ and
$\mathrm{Cov}(y_o - y_c) = \overline{(y_o - y_c)^2} = 1.0$.
[Plot: Histogram IASI Channel 3577]

Figure 2: Histogram of the brightness temperature difference between
observed and calculated spectra (OBS-CALC) for IASI channel 3577
(1539 cm$^{-1}$). The stepwise line is the measured histogram and the
smooth line is the fitted Gaussian.
[Plot: Histogram IASI Channel 5800; horizontal axis: BT Obs - BT Calc (K)]

Figure 3: Histogram of the brightness temperature difference between
observed and calculated spectra (OBS-CALC) for IASI channel 5800
(2094.75 cm$^{-1}$). The stepwise line is the measured histogram and the
smooth line is the fitted Gaussian.
[Plot: IASI bias = OBS - CALC average]

Figure 4: OBS-CALC bias: observed minus calculated (ECMWF analyses
+ RTIASI 4.1) brightness temperature averages for all IASI channels.
[Plot: IASI error = OBS - CALC STDV; horizontal axis: Wavenumber (cm$^{-1}$)]

Figure 5: OBS-CALC standard deviation: observed minus calculated
(ECMWF analyses + RTIASI 4.1) brightness temperature standard deviations
for all IASI channels. Also shown is the instrument noise for one
randomly chosen atmospheric state.






[Plot panels: Temperature, 266 cases; Water vapour, 266 cases. Horizontal
axes: Bias and STDV T (K); Bias and STDV T_d (K)]

Figure 6: Bias (curves on the left of each graph) and standard deviation
(curves on the right of each graph) of the retrieval statistics using the
diagonal of the instrument noise, a constant of 2 K and the OBS-CALC
standard deviation as the error covariance matrix in the retrievals. The
error covariance matrix that provides the optimal retrievals when comparing
with ECMWF analyses is the OBS-CALC one, as the analytical proof of
Appendix A shows.
[Plot: Lat=36.85° Lon= 124.56°. 2007/04/18 12:23:03; horizontal axis:
T, T_d (°C)]

Figure 7: Typical IASI retrieval shown together with the co-located
ECMWF atmospheric profile. There is a low level inversion that is clearly
retrieved.
[Plot: Lat=39.29° Lon=-52.49°. 2007/04/19 00:13:11; horizontal axis:
T, T_d (°C)]

Figure 8: Typical IASI retrieval shown together with the co-located
ECMWF atmospheric profile. This one has a flatter temperature profile,
which is also relatively well retrieved, as is the humidity profile.
[Plot: Lat=-45.37° Lon=146.35°. 2007/04/28 11:52:39]

Figure 9: Typical IASI retrieval shown together with the co-located
ECMWF atmospheric profile. Here we see how a strong low level inversion
is also reproduced by the retrieval, even with high humidity at lower
levels.
[Plot: Lat=38.93° Lon= 122.34°. 2007/04/28 12:16:55; horizontal axis:
T, T_d (°C)]

Figure 10: Typical IASI retrieval shown together with the co-located
ECMWF atmospheric profile. In this figure a humidity maximum is well
reproduced. This profile also has a strong inversion near the surface.
[Plot: Lindenberg 2007/06/08 19:58:01; horizontal axis: T, T_d (°C)]

Figure 11: IASI retrieval fine tuned for ECMWF analyses compared with
co-located radiosondes from Lindenberg launched five minutes before
overpass time.
[Plot: Lindenberg 2007/06/10 09:28:48; horizontal axis: T, T_d (°C)]

Figure 12: IASI retrieval fine tuned for ECMWF analyses compared with
co-located radiosondes from Lindenberg launched five minutes before
overpass time.
[Plot: Lindenberg 2007/06/19 19:30:19; horizontal axis: T, T_d (°C)]

Figure 13: IASI retrieval fine tuned for ECMWF analyses compared with
co-located radiosondes from Lindenberg launched five minutes before
overpass time.
[Plot: Lindenberg 2007/06/08 19:58:01; horizontal axis: T, T_d (°C)]

Figure 14: IASI retrieval fine tuned for ECMWF analyses compared with
co-located radiosondes from Lindenberg launched five minutes before
overpass time. Also added in this figure are the first guess (EOF
retrieval) and background state of the optimal estimation retrieval. Lines
on the left and right side correspond to dew point temperatures and
temperatures respectively.
Table 1: Scene selection.

Cloud detection
    -1 K < T(3.9 µm) - T(10.8 µm) < 3 K
    T(10.8 µm) > 276 K
    T(11.0 µm) > SST - 2.2 K
    T(4.0 µm) - T(11.0 µm) > 12 K
    T(9.3 µm) - T(11.0 µm) < K
    T(11.0 µm) - T(12.0 µm) < 1 K
    T(11.0 µm) - T(13.6 µm) > 18 K

Others
    |Solar zenith angle| < 80°
    |Latitude| < 50°
    |Scan angle| < 15°

T(10.8 µm), for example, is the brightness temperature of an AIRS channel
that lies at that wavelength (10.8 µm). SST is the sea surface temperature
derived from ECMWF analysis.